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Abstract 

Background: Spiders liave evolved pharmacologically complex venoms that serve to rapidly subdue prey and 
deter predators. The major toxic factors in most spider venoms are small, disulfide-rich peptides. While there is 
abundant evidence that snake venoms evolved by recruitment of genes encoding normal body proteins followed 
by extensive gene duplication accompanied by explosive structural and functional diversification, the evolutionary 
trajectory of spider-venom peptides is less clear. 

Results: Here we present evidence of a spider-toxin superfamily encoding a high degree of sequence and 
functional diversity that has evolved via accelerated duplication and diversification of a single ancestral gene. The 
peptides within this toxin superfamily are translated as prepropeptides that are posttranslationally processed to yield 
the mature toxin. The N-terminal signal sequence, as well as the protease recognition site at the junction of the 
propeptide and mature toxin are conserved, whereas the remainder of the propeptide and mature toxin sequences 
are variable. All toxin transcripts within this superfamily exhibit a striking cysteine codon bias. We show that 
different pharmacological classes of toxins within this peptide superfamily evolved under different evolutionary 
selection pressures. 

Conclusions: Overall, this study reinforces the hypothesis that spiders use a combinatorial peptide library strategy 
to evolve a complex cocktail of peptide toxins that target neuronal receptors and ion channels in prey and 
predators. We show that the oo-hexatoxins that target insect voltage-gated calcium channels evolved under the 
influence of positive Darwinian selection in an episodic fashion, whereas the K-hexatoxins that target insect 
calcium-activated potassium channels appear to be under negative selection. A majority of the diversifying sites in 
the oo-hexatoxins are concentrated on the molecular surface of the toxins, thereby facilitating neofunctionalisation 
leading to new toxin pharmacology. 
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Background 

Venoms have proven to be key evolutionary innovations 
for many divergent animal lineages [1,2]. Although the 
most extensively studied venoms are from the medically 
important scorpions, snakes, and spiders, venom sys- 
tems are present in many other lineages including 
cnidarians, echinoderms, molluscs, fish, lizards, and 
mammals [1,2]. These venoms have evolved to serve a 
variety of purposes, including prey capture, competitor 
deterrence, and defense against predators. There has 
been considerable innovation both in the chemical 
composition of these venoms as well as the method of 
venom delivery, which includes barbs, beaks, fangs, 
harpoons, nematocysts, pinchers, proboscises, spines, 
spurs, and stingers [1,2]. 

From a molecular evolutionary perspective, the venoms 
of snakes are the best understood. There is now abundant 
evidence that snake venoms evolved by recruitment of 
genes encoding normal body proteins followed by exten- 
sive duplication, neofunctionalization, and in some in- 
stances relegation to the status of pseudogene [1,3-5]. In 
many cases, these genes have been explosively replicated to 



produce large multigene families. This process is analogous 
to the birth-and-death model of evolution proposed for 
multigene families involved in adaptive immunity, such as 
the major histocompatibility complex and immunoglobulin 
genes [6]. However, the evolutionary trajectory is less 
clear for the venoms of spiders, scorpions, and molluscs, 
which are dominated by disulfide-rich peptides of mass 
2-9 kDa [7-12]. These peptides typically possess high affin- 
ity and often-exquisite specificity for particular classes of 
ion channels and other nervous system targets [13-15]. 
These neurotoxic ftinctions are perhaps not surprising 
given that the primary role of these venoms is to paralyse 
or kill envenomated prey [11,16,17]. 

In this study, we analysed toxin-encoding transcripts 
from five species of Australian ftinnel-web spider (Aranae: 
Mygalomorphae: Hexathelidae: Atracinae) from the genera 
Atrax and Hadronyche, representing a geographic spread 
of more than 2000 km (Figure 1), in order to provide 
insight into the evolutionary trajectory of the co-hexatoxin- 
1 (co-HXTX-l) family. co-Hexatoxins (formerly known as 
(o-atracotoxins) are peptides comprising -37 residues that 
were first isolated from the venom of the lethal Blue 
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Figure 1 Distribution, venom collection and venom-gland dissection of Australian funnel-web spider species used in this study. (A) 

Map of the eastern half of Australia showing the distribution of the five species of Australian funnel-web spider used in this study. (B) Female 
funnel-web spider {Hodronyche infenso) from Fraser Island, QLD. In response to provocation, the spider has adopted a typical aggressive/defensive 
posture, with front legs and pedipalps raised and the fangs in an elevated position ready to strike. Note the drop of venom on each of the fang 
tips. (C) A single H. versuta venom gland that has been dissected from the surrounding muscle tissue. The venom gland in these and other 
mygalomorph spiders is located directly below the dorsal surface of the chelicerae. 
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Mountains funnel- web spider Hadronyche versuta [18]. 
The co-hexatoxins are major components in the venom of 
Australian funnel-web spiders [18-20] and they contribute 
significantly to prey immobilization by virtue of their ability 
to specifically block insect, but not vertebrate, voltage- 
gated calcium (Cay) channels [17,18,20-22]. Their po- 
tent insecticidal activity has engendered interest in these 
peptides as bioinsecticides [11,17,23]. Proteomic analysis 
of K versuta venom revealed a number of co-HXTX-Hvla 
paralogs [19], suggesting that this peptide toxin might be- 
long to a multigene family. However, because the venom 
used in this previous study was pooled from several spi- 
ders, it was unclear whether these apparent paralogs are 
simply polymorphisms resulting from allelic variation. 
By using cDNA libraries obtained from a single spider, 
we demonstrate here that co-HXTX-Hvla is indeed part 
of a large multigene family that appears to have arisen 
from explosive gene duplication followed by extensive 
sequence divergence and neofunctionalization. Within 
this superfamily of toxins, we show that pharmacologic- 
ally distinct toxin classes are evolving under starkly 
different selection pressures, with some toxin classes 
accumulating variation under episodic bursts of adapta- 
tion, while others remain constrained by negative selec- 
tion. This work reinforces the idea that the chemical 
and pharmacological diversity present in spider venoms 
may have evolved from a relatively small number of an- 
cestral genes. 



Results and discussion 

The u)-hexatoxlns are expressed as prepropeptide 
precursors 

RACE analysis was used to amplify transcripts encoding 
orthologs of (o-HXTX-Hvla from four species of Australian 
funnel-web spider: Atrax robustus, H, infensa, H, venenata, 
and H, versuta (Figure 2). Multiple co-HXTX-Hvla ortho- 
logs were identified in each species (i.e., 24 paralogs encod- 
ing seven distinct mature toxins were identified in H. 
infensa, 18 paralogs encoding six mature toxins were identi- 
fied in the Sydney funnel-web spider A. rohustus, and eight 
paralogs encoding two mature toxins identified in the 
Tasmanian funnel-web spider H. venenata) (Figure 3A). 
A further eight paralogs encoding four distinct mature 
toxins were identified in the venom-gland transcriptome 
of H. modesta (Figure 3A). Thus, the amino acid sequence 
diversity previously reported for co-HXTX-l based on ana- 
lysis of pooled venom samples [19] is due to expression of 
multiple related transcripts in a single spider rather than 
allelic variation. The almost complete conservation of the 
signal sequence, as well as the pattern of conserved cyste- 
ines in the mature toxin (Figure 3A), indicates that these 
(0-HXTX-Hvla homologs arose by duplication and se- 
quence divergence of the original toxin-encoding gene. 

All of the co-HXTXs are expressed as prepropeptide 
precursors that are posttranslationally processed to yield 
the mature toxin sequence (Figure 2A). The highly hydro- 
phobic 22-residue signal sequence is of similar length to 
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Figure 2 Schematic representation of toxin precursors, overall RACE amplification strategy and identification of toxin superfamilies. 

(A) Schematic representation of a typical spider peptide precursor sliowing tine signal peptide in orange, the propeptide in purple, and the 
mature toxin in black. After translation, the signal and propeptide regions are proteolytically removed to yield a functional mature toxin. (B) 
General overview of the RACE protocol for sequencing hexatoxin transcripts. Adaptors are added to the 5' and 3' end of transcripts during cDNA 
library preparation. In both 3' and 5' RACE, gene-specific primers are used in the forward (3' RACE) or reverse (5' RACE) orientation to amplify 
full-length sequences. The resulting PCR products are then cloned and sequenced. (C) Schematic representation of the Shiva superfamily 
highlighting the combinatorial nature of spider-venom peptides. 
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Figure 3 Sequence alignments of w and k -hiexatoxins. (A) Sequence alignment of oo-HXTX-Hvla paralogs from each species: Hodronyche 
modesto (Hmola-Hmold), Atrox robustus (Arla-Arlg), Hodronyche infenso (Hila-Hilg), and Hodronyche venenoto (Hvnla and Hvnlb). The level of 
residue conservation is graded from black (fully conserved across all paralogs) to dark grey (conserved in most toxins) to light grey (conserved in 
a majority of orthologs). The lines below the sequence alignment indicate the disulfide-bond connectivity of oo-HXTX-Hvla. (B) Alignment of 
K-HXTX-Hvla paralogs from Hodronyche versuto and one ortholog from Hodronyche modesto. The level of residue conservation is graded as in 
panel (A) and the signal peptide, propeptide, and mature toxin regions are highlighted. The lines below the sequence alignment indicate 
the disulfide-bond connectivity of k-HXTX-Hv1c, with the vicinal disulfide bond highlighted in red. Note that oo-HXTX-Arla (UniProt PF06357), 
oo-HXTX-Hila (UniProt P0C2L5), oo-HXTX-Hil b (UniProt P0C2L6), oo-HXTX-Hilc (UniProt P0C2L7), and k-HXTX-HvIc (UniProt P82228) have been 
previously isolated directly from venom. 



that reported for peptide-toxin precursors from spiders, 
scorpions, and cone snails [8]. The 20-residue propeptide 
sequence is highly acidic, with a net charge of -4, a feature 
that has been noted for numerous spider-toxin propeptide 
sequences [24-28] but which is not characteristic of toxin 
precursors from other venomous animals. Moreover, 
the presence of a propeptide region contrasts with most 
scorpion-toxin precursors in which the signal sequence 
is fused directly to the mature toxin without an inter- 
vening propeptide [8]. The reason for the highly acidic 
propeptide region in spider toxin precursors remains to 
be determined, but it may be related to specific interac- 
tions between the toxin precursor and components of 
the secretory and/or protein folding pathway in spider 
venom glands. 

The propeptide sequence terminates with a dibasic Arg- 
Arg signature; dibasic sequences are common recognition 
sites for proteolytic removal of propeptide segments in 
neuropeptide precursors from both vertebrates [29] and 
invertebrates [30]. While Arg is the terminal residue in 
virtually all known spider-toxin propeptide sequences, the 
penultimate residue is variable, though it is commonly 
Asp, Glu, or Lys [25,26,31-33]. 

(o-hexatoxins belong to a large toxin-gene superfamily 

Orthologs of (o-HXTX-Hvla were identified in all five 
species of Australian funnel-web spider examined in this 



study. These species are distributed along the eastern 
seaboard of Australia with a geographic spread of more 
than 2000 km (Figure 1). The hexathelids are a group of 
approximately 40 species divided into three genera: Atrax, 
Hadronyche and Illawarra [34-36]. They are adapted to 
forest environments but can also be found in habitats that 
range from montane herblands and open woodland to 
closed forest [36]. Conservation of the co-hexatoxin family 
of toxins over this wide range of environments and differ- 
ing prey distributions implies that there has been strong 
evolutionary pressure to maintain these peptides as part of 
the venom arsenal, which is perhaps not surprising given 
that they are broadly active against many different arthro- 
pods [11,17,37]. 

In addition to obvious homologs of co-HXTX-Hvla, 
the RACE and transcriptomic analyses revealed add- 
itional families of toxins that had almost identical signal 
sequences to the co-HXTX-l transcripts, but divergent 
propeptide and mature toxin sequences. We named one 
of these families the co/k-HXTX family (described as U- 
ACTX in [38]). The co/k-HXTX peptides appear to be dis- 
tributed in two of the species examined {A, robustus and 
H, versuta); this reinforces the idea that these toxins most 
likely arose ancestrally by duplication of a co-HXTX-l 
gene followed by hypermutation of the propeptide and 
mature-toxin regions in order to create a new function 
(neofunctionalization). The conservation and radiation of 
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these toxins across this family of spiders implies that they 
are not nonfunctional relics of an explosive radiation of 
this toxin-gene superfamily, and we confirmed this by 
showing that recombinant co/K-HXTX-Hvla is highly 
insecticidal [38]. The high insecticidal potency of this 
family of peptides is believed to result from a synergistic 
effect on insect voltage-gated calcium (Cay) channels 
and calcium-activated potassium (Kca) channels [38]. 

RACE analysis of the venom-gland cDNA library from 
H. versuta also led to amplification of transcripts encoding 
the insecticidal toxin k-HXTX-HvIc [39], and sequencing 
of the venom-gland transcriptome from H, modesta also 
uncovered an ortholog of this toxin (Figure 3B). This was 
entirely unexpected since this toxin has a vastly differ- 
ent primary structure to co-HXTX-Hvla [39]. Moreover, 
in addition to the six conserved cysteine residues in co- 
HXTX-Hvla that form an inhibitor cystine knot (ICK) 
motif [40,41], K-HXTX-Hvlc contains two additional 
cysteine residues that form an extremely rare vicinal disul- 
fide bond [42-44]. Furthermore, in contrast to co-HXTX- 
Hvla, which blocks insect Cay channels, k-HXTX-Hv1c is 
a potent and specific blocker of Kca channels [45] . Never- 
theless, the near identity of the signal sequence in these 
two toxin families and the conservation of cysteine resi- 
dues in the mature toxin indicate that they evolved from 
the same ancestral toxin gene and are members of the 
same gene superfamily. 

We did not find orthologs of k-HXTX-HvIc in any of 
the other three species of Australian funnel-web spider {H, 
infensa, A, robustus, and H, venenata). However, k-HXTX- 
Hvla, K-HXTX-Hvlb, and k-HXTX-HvIc are expressed at 
very low levels in H. versuta venom [42], and conse- 
quently we cannot rule out the possibility that these 
toxins are present in the venom of the other three spi- 
ders but the transcript levels are too low to be detected 
using the methods employed here. 

The Shiva superfamily of peptide toxins 

It has previously been suggested that superfamilies of 
spider-venom peptides evolved from a single ancestral 
gene via explosive gene duplication [8]; the work de- 
scribed here further supports this idea as it is clear that 
the co-HXTXs, co/k-HXTXs, and k-HXTXs belong to a 
large superfamily of toxins that arose via gene duplica- 
tion (Figure 2C). We have chosen to name spider-toxin 
gene superfamilies after deities of death and destruction 
since the major biological role of these toxins is to 
paralyze and/or kill envenomated prey. Accordingly, we 
have named the co/k-HXTX/co-HXTX/k-HXTX gene 
superfamily after the Hindu deity Shiva, commonly 
known as the "destroyer". 

Sequence logos were previously used to analyse differ- 
ences in the level of sequence conservation between the 
three parts of the co-HXTX toxin precursor, namely the 



signal peptide, propeptide, and mature toxin [8]. A re- 
vised logo analysis of the Shiva superfamily (Figure 4 A) 
that incorporated all of the new sequences and species 
reported here reinforced the dichotomy in evolutionary 
forces affecting various elements of the toxin precursor. 
The signal peptide has clearly been highly conserved 
throughout the evolution of this toxin superfamily and it is 
presumably under negative selection in order to ensure 
that these toxins are directed to the appropriate secretory 
pathway. In contrast, there is significant sequence variation 
in both the propeptide and mature toxin sequences, with 
two notable exceptions. First, in contrast to the highly 
variable upstream region of the propeptide sequence, the 
C-terminal proteolytic recognition signal (Arg-Arg) is 
completely preserved (Figure 4A). Presumably there has 
been strong selection pressure to ensure processing of the 
propeptide by a specific protease. Second, in contrast to 
the overall low level of conservation of the mature toxin 
sequence, the cysteine residues, which direct the three- 
dimensional (3D) fold of the toxins, are completely con- 
served (Figure 4A). The marked variation in levels of 
sequence conservation between the spider-toxin signal 
sequence and the propeptide and mature toxin regions 
is reminiscent of that observed for superfamilies of cone 
snail toxins [46-51]. 

There are two striking differences between the Shiva 
superfamily precursors and transcripts encoding human 
neuropeptides and other secreted proteins. First, whereas 
precursors of human neuropeptides often encode multiple 
mature neuropeptide sequences [29,52], we and others 
have not found any examples of spider-toxin transcripts 
that encode more than a single mature toxin sequence. 
Secondly, in direct contrast to the toxin precursors, the 
sequence of the mature human neuropeptide (s) is usually 
strongly conserved whereas there is significantly more 
variability in the signal sequence. This is perhaps not sur- 
prising given that human neuropeptides usually act on a 
single well-defined molecular target whereas spider toxins 
typically target a specific subtype of receptor or ion chan- 
nel that nevertheless might vary significantly in primary 
structure between prey taxa (Note that most spiders are 
generalist predators that target a phylogenetically diverse 
range of prey). Thus, expressing a family of related toxins 
in the venom (essentially a mini-combinatorial peptide li- 
brary) might ensure that the desired receptor /ion channel 
is targeted, regardless of prey taxa. 

Position-specific cysteine codon bias 

Mature co-HXTXs contain three disulfide bonds with 1- 
4, 2-5, 3-6 connectivity. These disulfides form an ICK 
motif that provides these toxins with a high degree of 
chemical, thermal and biological stability [53]. Although 
it is clear from a protein structure viewpoint why these 
six cysteine residues need to be strictly maintained in 
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Figure 4 Sequence logo and codon usage analysis from the Shiva superfamily. (A) Sequence logo [54] based on alignment of 
prepropeptides from the Shiva superfamily. There is a much higher level of sequence conservation within the signal peptide than within the 
propeptide and mature toxin regions. Note, however, that the cysteine residues that form the cystine-knot motif in the mature toxin and the 
Arg-Arg protease recognition site that terminates the propeptide region are both completely conserved (highlighted in blue and red, 
respectively). (B) Codon usage for the six-cysteine residues that form the cystine-knot motif. Note the strong bias for TGC at cysteine positions 1, 
3, 4, and 6. Shown above the histogram is the disulfide bridge arrangement for the six cysteines as inferred from the 3D structures of 
co-HXTX-Hvla and k-HXTX-Hv1c. 

V ) 



order to preserve the toxin s 3D scaffold, one would not 
expect to find a preference for either one of the two pos- 
sible cysteine codons (TGT and TGC). Intriguingly, 
however, previous analysis of co-HXTX precursors re- 
vealed a strong bias for TGC at four of the six cysteine 
positions in the mature toxin region [8]. An extended 
logo analysis [54] incorporating all of the newly discovered 
sequences reported in this study corroborated the previ- 
ously observed codon bias (Figure 4B). We found an ex- 
treme TGC codon bias for the four cysteine residues that 
form the 1-4 and 3-6 disulfide bridges in the co-HXTX 
family but not for the two cysteines that form the 2-5 



disulfide bond (Figure 4B). The observed position-specific 
codon bias is not simply a manifestation of global codon 
bias in these spiders as we have observed a preference for 
TGT as opposed to TGC for cysteine residues in other 
hexatoxin superfamilies (data not shown). Moreover, we 
did not observe extreme codon bias for any other con- 
served residue in the mature hexatoxins. 

Position-specific cysteine-codon bias has also been ob- 
served in superfamilies of cone snail toxins and it has 
been proposed that these codons might serve as attrac- 
tants for a mutator complex that includes a poorly pro- 
cessive and highly mutagenic polymerase (e.g., DNA Pol 
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V) that promotes radiation of the toxin superfamily by 
facilitating hypermutation of the mature toxin region 
[49,50]. However, there is currently no direct evidence 
that cysteine-codon bias plays a part in directing the 
evolution of spider or cone snail toxins. 

Molecular evolution analyses 

We utilized various state-of-art molecular evolutionary 
assessment methods to determine the influence of nat- 
ural selection on the evolution of genes encoding Shiva 
superfamily toxins (see Methods section for full details 
of the selection analyses). The one-ratio model, the sim- 
plest of the codon-specific models, estimated the non- 
synonymous-to-synonymous nucleotide-substitution rate 
ratio (co) to be 0.64, 1.06 and 0.69 for the co-HXTXs, 
K-HXTXs, and combined Shiva superfamily dataset, re- 
spectively (Additional file 1: Table SI -3). This highly 
conservative model can only detect positive selection 
when (0, averaged over all sites along the lineages in a 
phylogenetic tree, is significantly greater than one. As 
lineage-specific models of PAML, such as the one-ratio 
model, often fail to detect positive-Darwinian selection 
that only affects certain sites in proteins, we also employed 



site-specific models (Table 1: codon numbers based on k- 
HXTX-Hvlc_2 and co-HXTX-Arla_l; Additional file 1: 
Table Sl-3). Model 8 estimated co of 0.69, 1.06 and 0.78 
for the (o-HXTXs, k-HXTXs, and the combined Shiva 
superfamily dataset, respectively (Table 2 and Additional 
file 1: Table Sl-3). Although the computed co for the 
K-HXTXs was >1, the assessment was not statistically 
significant (p > 0.05) in comparison with the null model 
(M7 p). The Bayes Empirical Bayes (BEB) approach imple- 
mented in M8 was only able to identify one positively se- 
lected site in the combined toxin dataset (Table 2 and 
Additional file 1: Table S3). Thus, the site-specific models 
failed to detect the influence of adaptive selection pres- 
sures in shaping evolution of the Shiva superfamily. In 
contrast, the more advanced Fast, Unconstrained Bayesian 
AppRoximation (FUBAR) [55,56] implemented in HyPhy 
detected a handful of positively selected sites in both the 
o)-HXTXs and the combined dataset (Table 1). 

Site-specific models for detecting positive selection 
work best when detecting pervasive selection pressures. 
However, the majority of positively selected sites are 
often subjected to transient or episodic adaptations. When 
the majority of lineages evolve under the influence of 



Table 1 Nucleotide and complementary protein analyses for co toxins 



Site^ CodeML Tree SAAP Accessible 



Codon 


Amino Acid 


M2a'' 


M8^ 


Property'^ 


Magnitude^ 


surface area 


21 


E 


0.99 ± 0.38 


0.87 ± 0.42 












(0.201) 


(0.257) 








40 


V 


1.41 ±0.49 


1 .47 ± 0.37 




8, 8, 8, 8 


42.0 






(0.589) 


(0.842) 






Partially exposed 


44 


S 


1.25 ±0.37 


1.28 ±0.41 




8, 8, 8, 8 


82.1 






(0.420) 


(0.647) 






Exposed 


53 


H 


1 .62 ± 0.63 


1.55 ±0.36 






0.0 






(0.749) 


(0.929) 






Buried 


57 


G 


1 .39 ± 0.48 


1 .45 ± 0.38 






57.3 






(0.570) 


(0.825) 






Exposed 


60 


T 


0.80 ± 0.44 


0.71 ±0.41 






49.7 






(0.122) 


(0.154) 






Exposed 


64 


N 


0.53 ± 0.40 


0.51 ± 0.30 






100.0 






(0.031) 


(0.036) 






Exposed 


69 


T 


1 .34 ± 0.45 


1 .38 ± 0.40 






59.8 






(0.507) 


(0.751) 






Exposed 


72 


R 


1.10 ±0.33 


1.01 ±0.42 






0.0 






(0.26) 


(0.374) 






Buried 



^Sites detected as positively selected using the integrative approach. 

'^M2a Bayes empirical Bayes (BEB) posterior probability and post-mean w indicated in parentheses. 
''MS Bayes empirical Bayes (BEB) posterior probability and post-mean w indicated in parentheses. 

'^Amino acid property under selection {Mw- molecular weight; My: molecular volume; V^: partial specific volume; ju; Refractive index). 
^Magnitude if selection on the amino acid property. 

^Accessible surface area of 10-20% corresponds to buried residues, 40-50% indicates partially exposed amino acid residues, and >50% indicates solvent 
exposed residues. 
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Table 2 Molecular evolution of (u and k toxins from 
Australian funnel-web spiders 

FUBAR^ MEME'' PAML^ 









M8 


M2a 


03 


oo>l':3 


7 


0 


0 


toxins 


oo<l":5 




0 


0 








0.69 


0.73 


K 


00 >!': 1 


0 


0 


0 


toxins 


oo<l^: 0 




0 


0 








1.06^^ 


1.06^^ 


ALL 


00 3 


8 


1 


0 


toxins 


oo<l":7 




(0+1) 


0 








0.78 


0.83 



^Fast, Unconstrained BayesianApproximation (FUBAR). 

■^Sites detected as experiencing episodic diversifying selection (0.05 

significance) by mixed effects model evolution (MEME). 

'^Positively selected sites detected using the Bayes empirical approach 

implemented in the site models MS and IV12a. Number of positively selected 

sites detected at the posterior probability >0.99 and 0.95 are indicated in 

parenthesis, w computed using MS and M2a are also presented. 

^Number of sites evolving under the influence of pervasive diversifying 

selection, detected by FUBAR at 0.9 posterior probability. 

^ Number of sites evolving under the influence of pervasive purifying 

selection, detected by FUBAR at 0.9 posterior probability. 

w = mean dN/dS. 

NS = not significant at 0.05 compared to the null model (M7: beta). 

negative selection, they mask the signal of positive selec- 
tion that influences only a small number of lineages. In 
such scenarios, the aforementioned analyses may fail to 
detect the influence of positive selection. To address the 
shortcomings of the aforementioned approaches, we 
employed the advanced Mixed Effects Model Evolution 
(MEME) [57], which uses fixed effects likelihood (PEL) 
along the sites and random effects likelihood (REL) 
across the branches to detect episodic diversifying selec- 
tion. MEME is capable of identifying both pervasive and 
episodic adaptations. MEME identified 7 and 8 episodic- 
ally diversif)^ing sites in the co-HXTXs and combined 
toxin dataset, respectively (Table 2), highlighting the 
vital role of episodic diversifying selection in shaping 
the evolution of these spider toxins. Six out of eight epi- 
sodically diversifying sites (75%) were located on the 
molecular surface of the toxins (Table 1 and Figure 5B) 
with their side chains completely or partially exposed to 
solvent, suggesting that they could act as pharmaco- 
logical sites and participate in prey envenomation; these 
findings are also in agreement with the selection forces 
found on the surface of the SGTx toxin family from the 
venom of the African Baboon spider Scodra grisiepies 
[58]. Rapid Accumulation of Variations in Exposed Resi- 
dues (RAVER), where the toxin molecular chemistry 
undergoes hypervariations under the influence of posi- 
tive Darwinian selection and focal mutagenesis [59], has 
been documented in a plethora of venom-components 
from a wide diversity of venomous animal lineages 



[59-64]. Since the synthesis and secretion of venom pro- 
teins is energetically expensive [65-67], mutations that 
disrupt the structure/function of proteins are filtered 
out of the population by negative selection over time, fa- 
voring the conservation of catalytic and structurally im- 
portant residues. RAVER not only aids in generation of 
a rapidly variable toxin molecular surface biochemistry, 
but it also ensures the conservation of structurally and 
functionally important residues. Accumulation of varia- 
tions on the molecular surface of the toxin is advanta- 
geous as the altered surface chemistry might lead to 
new toxin functions (neofunctionalisation). 

To derive further support for the positively selected sites 
detected by nucleotide analyses, we employed a comple- 
mentary protein-level approach implemented in TreeSAAP 
(Table 1). TreeSAAP identified two positively selected sites 
in the co-HXTXs that were in common with the sites iden- 
tified by site-model 8 of PAML (Table 1). Evolutionary 
fingerprint analyses (Figure 5A) clearly revealed several res- 
idues in the co-HXTXs and the combined toxin dataset that 
evolve under the influence of positive selection, while a 
majority of residues in the k-HXTXs remained under evo- 
lutionary constraint (Figure 5A,B). Thus, evolution of the 
co-HXTXs has been significantly influenced by short bursts 
of episodic adaptations, while the k-HXTXs appear to be 
under negative selection. 

Phylogenetic analysis revealed that the k-HXTXs form 
a separate clade to the co-HXTXs, rendering the Shiva 
superfamily non-monophyletic (Figure 6). There are 
also significant variations within the co-HXTXs suggest- 
ive of functional diversification (Figure 6). The "hybrid" 
co/k-HXTXs exhibit functional characteristics of both 
the co-HXTXs and k-HXTXs as they block Cay channels 
(like the co-HXTXs) as well as Kca channels (like the k- 
HXTXs). The functional activity of the co/k-HXTXs 
combined with their relative phylogenetic placement and 
cysteine pattern indicates that they are structurally and 
functionally intermediate between the co- and k-HXTXs. 
The evolution of new cysteine residues to create the vici- 
nal disulfide bond in the k-HXTXs potentiated toxin 
activity on Kca channels, since mutagenesis and analogue 
studies indicate that this vicinal disulfide bond is the most 
critical part of the Kca pharmacophore [43,45,68]. 

Constraints on mutation of the mature toxin sequence 

It is generally considered that conservation of the cyst- 
eine scaffold in toxin-gene superfamilies is critical for 
conserving the toxins 3D fold [35]. However, the incred- 
ible disparity in the amino acid sequence between co- 
HXTX-Hvla and k-HXTX-Hv1c (Figure 7A) begs the 
question of whether this is reflected in a significant differ- 
ence in their 3D structures, despite their common cystine- 
knot scaffold. The 3D structure of both toxins has 
been determined previously using homonuclear NMR 
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Figure 5 Molecular evolution analyses of k- and w-HXTXs. (A) Evolutionary fingerprint of k- and cjo-HXTXs. Estimates of tine distribution of 
synonymous (a) and non-synonymous ((3) substitution rates inferred for tine k-HXTXs, co-HXTXs, and tine combined Sliiva superfamily dataset. Tlie 
ellipses reflect a Gaussian-approximated variance in each individual rate estimate, and colored pixels show the density of the posterior sample of 
the distribution for a given rate. The diagonal line represents the idealized neutral evolution regime (00 = 1), while points above and below the 
line correspond to positive selection (00 > 1) and negative selection (ax 1), respectively. The 00 for site model 8, along with the total number of 
positively selected sites detected by its Bayes Empirical Bayes (BEB) approach and the number of episodically diversifying sites detected by the 
mixed effects model of evolution (MEME), are also indicated. (B) Molecular evolution of Shiva superfamily toxins from Australian funnel-web 
spiders. 3D homology models are shown with their molecular surface colored according to the evolutionary conservation of amino acids (see 
color key); the location of positively selected sites is shown in red in space-fill models and as red spheres in wireframe models. A line plot is also 
provided to highlight the relative accumulation of dN versus dS, estimated using the MO model of PAML. NS: Not significant. 



spectroscopy [18,42] and their pharmacophores elucidated 
using alanine scanning mutagenesis [17,21,43,69]. 

Figure 7B and C show schematic representations of 
the 3D structure of k-HXTX-Hv1c and co-HXTX-Hvla, 
respectively. The two toxins can be considered to com- 
prise four inter-cystine loops, which are labelled 1-4 from 
N- to C-terminus. Although there is an obvious similarity 
in the disposition of the three centrally located disulfide 
bridges that form the cystine-knot motif in each toxin, the 
overall topology of the toxins, as well as the size and 
relative orientation of the four inter-cystine loops, ap- 
pears quite different. However, the structural overlay in 
Figure 7D, which was generated automatically by the 
DaliLite structural alignment program [70], reveals that 
the two structures are in fact remarkably similar. 

The DaliLite alignment yields a root mean square 
deviation of 2.4 A over the backbone atoms of the 28 
aligned residues, indicating that the two toxins are in- 
deed structural homologs. The three central disulfide 
bridges and loop 1 align remarkably well. Loop 4, which 
encompasses the p-hairpin present in both toxins, also 
aligns well except for the four-residue insertion in 
(o-HXTX-Hvla (Figure 7A), which increases the size of 
the hairpin loop at the tip of Loop 4. The major 



structural differences between the two toxins are the 
very different orientations of loops 2 and 3. However, 
these structural variations cannot disguise the fact that 
the two toxins essentially conform to the same 3D scaf- 
fold despite their extraordinary sequence divergence 
(16% identity if the cysteine framework is excluded). 
This ability to maintain a consistent molecular architec- 
ture despite massive variation in the inter-cystine loop 
sequences has important implications for the mechanism 
by which this superfamily of peptide toxins has evolved. 

Conclusions 

Spiders and other venomous animals rely on the produc- 
tion of pharmacologically complex venoms for defense, 
prey capture, and competitor deterrence. The major com- 
ponents of most spider venoms are disulfide-rich peptides 
that have evolved to target a wide range of receptors and 
ion channels in the insect nervous system. The co-HXTX 
and K-HXTX families were the first peptides isolated from 
Australian funnel-web spiders that were shown to be in- 
secticidal [17]. Analysis of all transcripts encoding these 
peptides showed that they are initially expressed as pre- 
propeptides that are proteolytically processed to yield a 
36-37 residue mature peptide that contains three disulfide 
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Figure 6 Bayesian phylogenetic tree representing thie molecular evolutionary history of the Shiva superfamily toxins. The tree shows 
the split between the three main toxin classes (00, k, and oo/k). oo-Actinopoditoxin-Mbla from the Eastern mouse spider Missulena bradleyi was 
used as the outgroup. Toxins belonging to each species are highlighted in the following colours: H. versuto, red; H. modesto, black; H. venenata, 
green; H. infensa, magenta; A. robustus, pale blue; H. fornnidabilis, dark blue; oo-actinopoditoxin-Mbla from M. bradleyi, orange, "denotes a species 
not sequenced as part of this study; the sequence was downloaded from UniProt under accession number P83588. 



bridges that form an ICK motif plus a non-canonical vici- 
nal disulfide bond in the k-HXTXs. 

The extreme diversity of primary structure within the 
Shiva toxin superfamily suggests that there have been 
few evolutionary restraints on sequence diversification 
outside of the disulfide bridges that direct the 3D fold of 
these peptides. The co-HXTXs, in particular, seem to 
have evolved under the influence of positive Darwinian 
selection in an episodic fashion, whereas the k-HXTXs 
appear to be constrained by negative selection pressures. 
Functional assessments of these toxins should shed fur- 
ther light on why they have adopted quite contrasting 
molecular evolutionary regimes. co-HXTXs were also 
found to have adopted RAVER, where a large number of 
the episodically diversifying sites are concentrated on 
the molecular surface, facilitating the generation of novel 



pharmacological sites. These toxins may therefore be good 
candidates for in vitro evolution studies designed to pro- 
duce modified peptides with desired therapeutic [14] or 
agrochemical [11] properties. Most importantly, this study 
reinforces the idea that the remarkable chemical and 
pharmacological complexity of spider venoms may be de- 
rived from a relatively small number of ancestral genes. 

Methods 

Identification of to-HXTX-l homologs via rapid 
amplification of cDNA ends 

Venom-gland cDNA libraries were prepared from indi- 
vidual specimens of the following species of Australian 
funnel-web spider (Arthropoda: Chelicerata: Arachnida: 
Araneae: Opisthothelae: Mygalomorphae: Hexathelidae): 
Hadronyche infensa ^ H, versuta, H, venenata, and Atrax 
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Figure 7 Structural comparisons of k-HXTX-Hv1c and w-HXTX-Hvla. (A) Comparison of the primary structures of k-HXTX-Hv1c and oo-HXTX- 
Hvla. Identities are boxed and sliaded orange wliile tine conserved cysteines tliat form tine cystine-l<not motif in eacli toxin are coloured red. 
(B) Richardson representation of the 3D structure of k-HXTX-Hv1c (PDB accession code IDLO; [42]) and (C) co-HXTX-Hvla (PDB accession code 
1AXH; [18]). Disulfide bonds are shown as orange tubes. The inter-cystine loops and the N- and C-termini are labelled. (D) Stereo-view of an 
overlay of the 3D structures of k-HXTX-Hv1c (dark blue with light-blue disulfide bonds) and oo-HXTX-Hvla (raspberry with light-salmon disulfide 
bonds). The structural alignment, which was automatically generated by DaliLite [70], yielded a root mean square deviation of 2.4 A over the 
backbone atoms of the 28 aligned residues (Alal-Alal 1, Pro15-Prol8, Ser21-Asn27, and Gly28-Arg33 in k-HXTX-Hv1c versus Pro2-Prol2, Asnl6- 
Serl9, Ser21-Asn27, and Thr32-Asp37). The inter-cystine loops and the N-terminus are labelled. Figures generated using MacPyMOL [91]. 



robustus, which were collected from geographically dis- 
tinct regions of Australia (Figure lA). Spiders were cooled 
to -20°C for 40-60 min, then paired venom glands were 
carefully dissected from each specimen (Figure IC). Each 
pair of venom glands was combined, then polyA + mRNA 
was extracted using a QuickPrep Micro mRNA Purifica- 
tion Kit (Amersham Pharmacia Biotech/GE Healthcare 



Life Sciences, Rydalmere, NSW, Australia) and stored at 
-20°C until further use. cDNA libraries were constructed 
using a Marathon cDNA Amplification Kit (Clontech 
Laboratories, Mount View, CA, USA). From the mRNA 
template, single-stranded cDNA was constructed using 
Superscript III reverse transcriptase (Life Technologies, 
Grand Island, NY, USA) and a poly(dT) anchor primer 
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(Echoclonanch-2, GGGCAGGT17). Second-strand synthe- 
sis was performed according to the kit specifications ex- 
cept the cDNA was purified using a Concert Rapid PGR 
Purification kit (Gibco/Life Technologies) instead of a 
phenol/chloroform extraction. A Marathon cDNA Ampli- 
fication adaptor (Clontech Laboratories) was then ligated 
to the double-stranded cDNA. After overnight ligation, 
the sample was precipitated using 10 \A of 5% w/w glyco- 
gen, 10 (il of 3 M sodium acetate pH 5.2, and 100 \A of 
100% ethanol at -20°C. The sample was subsequently 
washed with 80% ethanol and dried for 10 min prior to re- 
suspension in Tris-EDTA buffer. 

Transcripts encoding co-HXTX-Hvla and paralogs/ 
orthologs thereof were subsequently obtained via rapid 
amplification of cDNA ends (RACE) (Figure 2B) [71]. A 
redundant 3' PGR primer based on residues 24-31 of 
(o-HXTX-Hvla (co-HVl 5' RTTNCCRTTYTCRTTYT 
CYTTRAA 3') was used in conjunction with a 5' univer- 
sal adaptor primer in a 5' RACE experiment designed to 
extract information about the upstream region of the 
G)-HXTX-Hvla transcript (EchoAPl: 5' CACCCCTAA 
TACGACTCACTATAGG 3'). A gene-specific primer for 
3' RACE was then designed based on the leader se- 
quence obtained from the 5' RACE experiment (3' RACE 
primer: 5' TGCTGCAATATGAATACCGC 3'. This pri- 
mer was used in combination with the Echoclonanch-2 
oligo(dT) primer (5' GGGCAGGTTTTTTTTTTTTTT 
TTT 3') to generate transcripts that encode a signal se- 
quence homologous to that of co-HXTX-Hvla. 

PGR products were extracted from agarose gels using 
a Gibco gel purification kit, precipitated using Pellet Paint 
Co-Precipitant kit (Novagen/EMD Millipore, Billerica, 
MA, USA), then phosphorylated with kinase in prepar- 
ation for cloning. PGR products were then ligated into 
pSMART and transformed into E, cloni cells (Lucigen, 
Middleton, WI, USA) using the Lucigen CloneSmart Blunt 
Cloning kit. Transformed clones were cultured for one 
hour in Recovery Medium, then plated with 50 (ig/mL 
ampicillin to allow for overnight growth. PGR screening 
was then used to select colonies with the expected insert 
size for DNA sequencing. DNA sequences (and the cor- 
responding protein sequences) were collated and ana- 
lysed using Geneious Pro, version 3.8.5 [72] and signal 
sequence cleavage sites were predicted using SignalP, 
version 3.0 [73]. 

Identification of w-HXTX-l homologs via transcriptomics 

Paired venom glands from Hadronyche modesta were 
dissected out and pooled. Total RNA was extracted 
using the standard TRIzol® Plus method (Invitrogen/Life 
Technologies) according to the manufacturer s protocol. 
One microgram of Total RNA was used to construct a 
cDNA library using the CREATOR™ SMART™ cDNA li- 
brary construction kit (Clontech Laboratories) following 



the manufacturer s protocol. Briefly, total RNA was re- 
versed transcribed using the SMART™ Moloney Murine 
Leukemia virus (MMLV) reverse transcriptase. Second- 
strand synthesis was completed using long distance poly- 
merase chain reaction (PGR) as follows: 1 min at 95°C 
followed by 20 cycles of 1 min at 95°C and 6 min at 68°C. 
Products were then digested and size fractionated using a 
CHROMA SPIN-400 DEPC-H2O column (Clontech La- 
boratories) and then ligated into the pDNR-lib donor 
vector. Recombinant plasmids were electroporated into E- 
shot™ DHB10™-T1^ electro competent cells (Invitrogen/ 
Life Technologies). 384 clones were randomly selected and 
sequenced by capillary electrophoresis on an Applied Bio- 
systems 3730x1 DNA analyzer (Applied Biosystems/Life 
Technologies) at the Brisbane node of the Australian Gen- 
ome Research Facility (AGRF). Sequences were processed 
so vector and polyA + tails were clipped using CLC Main 
Work Bench (CLC-Bio), and the Blast2GO bioinformatic 
suite [74,75] was used to provide Gene Ontology, BLAST 
and domain/Interpro annotation. Signal sequence cleavage 
sites were predicted using SignalP, version 3.0 [73]. 

Nomenclature 

In accordance with the recently introduced systematic 
nomenclature for naming peptide toxins from venomous 
animals [76], co-ACTX-Hvla [18] and J-ACTX-Hvlc [39] 
have been renamed co-HXTX-Hvla and K-HXTX-Hvla, 
respectively, and the various paralogs and orthologs un- 
covered in this study have been named accordingly. 
Briefly, the Greek letter denotes the molecular target of 
the peptide, followed by the generic name indicating the 
family from which the toxin is derived; in this case the ab- 
breviation is HXTX for hexatoxin. After the generic family 
name, a two-letter abbreviation is used to denote the 
genus and species, indicated by upper and lowercase let- 
ters respectively (i.e., Hv for H, versuta, Hi for H, infensa, 
etc.). The name of the species is immediately followed by 
a numeral that helps to distinguish different toxins with 
similar pharmacology and this number is followed by let- 
ter that denotes the paralog number (this is based on the 
number of different encoded mature toxin sequences). 

Molecular evolution analyses 

A total of 73 nucleotide and 90 peptide sequences were 
aligned using the default settings in Geneious Pro, ver- 
sion 3.8.5 [72] then manually adjusted for optimal align- 
ment prior to the foUowing molecular evolution analyses 
(see Additional file 1: Figure SI and S2). 

Test for recombination 

To overcome the effects of recombination on the phylo- 
genetic and evolutionary interpretations [77], we employed 
Single Breakpoint algorithms implemented in the HyPhy 
package and assessed the effect of recombination on all the 
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toxin forms examined in this study [78,79] . When potential 
breakpoints were detected using the small sample Akaike 
information criterion (AIC), the sequences were compart- 
mentalized or partitioned before conducting selection ana- 
lyses to allow the recombining units to have distinct 
phylogenies (as described in [80,81]). 

Selection analyses 

We evaluated the influence of natural selection on the 
toxins using maximum-likelihood models [82,83] imple- 
mented in CODEML of the PAML software [84]. We 
employed site-specific models that estimate positive selec- 
tion statistically as an co value significantly greater than 1. 
We compared likelihood values for three pairs of models 
with different assumed co distributions as no a priori ex- 
pectation exists for co: MO (constant co rates across all 
sites) versus M3 (allows co to vary across sites within n 
discrete categories, where n > 3); Mia (a model of neutral 
evolution) where all sites are assumed to be either under 
negative (co <1) or neutral selection (co = 1) versus M2a (a 
model of positive selection) which in addition to the site 
classes mentioned for Mia, assumes a third category of 
sites; sites with co >1 (positive selection) and M7 (p) versus 
M8 (P and co), and models that mirror the evolutionary 
constraints of Ml and M2 but assume that co values are 
drawn from a p distribution [85]. Only if the alternative 
models (M3, M2a and M8: allow sites with co >1) show a 
better fit in a Likelihood Ratio Test (LRT) relative to their 
nuU models (MO, Mia and M7: do not allow sites co >1), 
are their results considered significant. LRT is estimated 
as twice the difference in maximum likelihood values 
between nested models and compared with the distri- 
bution with the appropriate degree of freedom— the differ- 
ence in the number of parameters between the two 
models. The Bayes empirical Bayes (BEB) approach [86] 
was used to identify amino acids under positive selection 
by calculating the posterior probabilities that a particular 
amino acid belongs to a given selection class (neutral, 
conserved, or highly variable). Sites with greater posterior 
probability (PP > 95%) of belonging to the w > 1 class' 
were inferred to be positively selected. 

FUBAR [55-57] implemented in HyPhy [78,79] was 
employed to detect sites evolving under positive and nega- 
tive selection. MEME [57], which is designed to overcome 
the drawbacks of site-specific assessments, was used to 
detect episodic diversifying selection. Mutations were also 
assessed via a complementary protein-level approach im- 
plemented in TreeSAAP [87]. An evolutionary fingerprint 
analysis was carried out using the ESD algorithm imple- 
mented in datamonkey [78,79,88] in order to clearly de- 
pict the proportion of sites under different regimes of 
selection. 

Logo plots [54] showing cysteine codon bias were con- 
structed using Geneious software, version 5.4. 



Shiva superfamily phylogenetic tree 

The molecular evolutionary history of the Shiva super- 
family toxins was reconstructed using Bayesian inference 
as implemented in MrBayes version 3.2.1 [89], using Iset 
rates = invgamma with the prset aamodelpr = mixed com- 
mand, which enables the program to optimize between 
the nine different amino acid substitution matrices imple- 
mented in MrBayes. WAG [90] was chosen as the best 
substitution matrix by the program. Tree searches were 
run using four Markov chains for a minimum of 10 mil- 
lion generations, sampling every 100th tree. The log likeli- 
hood score of each saved tree was plotted against the 
number of generations to establish the point at which the 
log-likelihood scores of the analyses reached their asymp- 
tote. 25% of the total trees sampled were discarded as bur- 
nin. The posterior probabilities for clades were established 
by constructing a majority rule consensus tree for aU trees 
generated after completion of the burnin. The tree was 
rooted using the sequence of co-actinopoditoxin-Mbla 
(also known as co-missulenatoxin-Mbla) from the Eastern 
mouse spider Missulena bradleyi as an outgroup; this 
toxin also blocks Cay channels and it has sequence 
homology to the co-HXTXs, including conservation of 
the co-HXTX-Hvla pharmacophore. However, although 
M, bradleyi and Australian funnel-web spiders (family 
HexatheUdae) both belong to the infraorder Mygalo- 
morphae, M, bradleyi is a member of the Actinopodidae 
family. 

Structural alignment of omega and kappa hexatoxins 

Atomic coordinates for co-HXTX-Hvla [18] and k-HXTX- 
Hvlc [42] were downloaded from the Protein DataBank 
(PDB accession codes lAXH and IDLO, respectively). A 
structural alignment of the toxins was automatically gen- 
erated using DaliLite [70]. All structure figures were gen- 
erated using MacPyMOL [91]. The Consurf webserver 
[92] was used for mapping evolutionary selection pres- 
sures on 3D homology models. 

Availability of supporting data 

Nucleic acid and protein sequence alignments and their 
respective accession numbers can be accessed from the 
supplementary material along with tables relevant to the 
molecular evolution analyses. 

Additional file 



Additional file 1: Diversification of a single ancestral gene into a 
successful toxin superfamily in highly venomous Australian 
funnel-web spiders. 
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